Using Vision to Improve Sound Source Separation
Abstract
We present a method for improving sound source separation using vision. Sound source separation is an essential function for auditory scene understanding: it separates the streams of sound generated by multiple sound sources. Once the streams are separated, a recognition process such as speech recognition can work on a single stream rather than on the mixed sound of several speakers. Separation performance is known to improve with stereo/binaural microphones and microphone arrays, which provide spatial information. However, these methods still leave a positional ambiguity of more than 20 degrees. In this paper, we add visual information to provide more specific and accurate position information. As a result, separation capability improved drastically. In addition, we found that the use of approximate direction information drastically improves the object tracking accuracy of a simple vision system, which in turn improves the performance of the auditory system. We claim that integrating visual and auditory inputs improves the performance of tasks in each modality, such as sound source separation and object tracking, by bootstrapping.
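To make the role of direction information concrete, the following is a minimal sketch of a generic two-microphone delay-and-sum step. It is not the method of the paper, whose details the abstract does not give. The function names, the 0.2 m microphone spacing, the far-field model, and the sign convention (positive azimuth means the sound reaches the left microphone first) are all assumptions made for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for room-temperature air


def inter_mic_delay(azimuth_deg, mic_spacing_m=0.2):
    """Far-field arrival-time difference (seconds) between two microphones.

    Assumed convention: azimuth 0 is straight ahead; a positive azimuth
    means the sound reaches the left microphone first.
    """
    return mic_spacing_m * math.sin(math.radians(azimuth_deg)) / SPEED_OF_SOUND


def delay_and_sum(left, right, azimuth_deg, sample_rate, mic_spacing_m=0.2):
    """Align the right channel to the left for one direction, then average.

    Sound from the steered direction adds coherently; sound from other
    directions is misaligned and partly cancels.
    """
    # Round the delay to a whole number of samples (no fractional delay).
    shift = round(inter_mic_delay(azimuth_deg, mic_spacing_m) * sample_rate)
    out = []
    for n in range(len(left)):
        m = n + shift
        # Treat samples outside the recording as silence.
        r = right[m] if 0 <= m < len(right) else 0.0
        out.append(0.5 * (left[n] + r))
    return out


def steering_error_samples(true_deg, assumed_deg, sample_rate, mic_spacing_m=0.2):
    """Misalignment (in samples) caused by steering at the wrong direction."""
    error_s = inter_mic_delay(true_deg, mic_spacing_m) - inter_mic_delay(
        assumed_deg, mic_spacing_m
    )
    return abs(error_s) * sample_rate
```

In this simplified model, `steering_error_samples` shows why a coarse direction estimate hurts: the larger the gap between the true and assumed azimuth, the more the two channels are misaligned before summing. A more precise direction, such as one supplied by a camera, shrinks that gap.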
Similar resources
Incorporating Visual Information into Sound Source Separation
We present a method for improving sound source separation using vision. Sound source separation is an essential function for auditory scene understanding: it separates the streams of sound generated by multiple sound sources. Once the streams are separated, a recognition process such as speech recognition can work on a single stream rather than on the mixed sound of several speakers. Th...
Robotic Sound Source Separation using Independent Vector Analysis
Besides haptics and vision, mobile robotic platforms are equipped with audition in order to autonomously navigate and interact with their environment. Speaker and speech recognition, as well as the recognition of different kinds of sounds, are vital tasks for human-robot interaction. In situations where more than one sound source is active, the mixture has to be separated before being passed to the ...
Interactive User-Feedback for Sound Source Separation
Machine learning techniques used for single-channel sound source separation currently offer no mechanism for user-feedback to improve upon poor results and typically require isolated training data to perform separation. To overco...
Blind Separation of Real World Audio Signals Using Overdetermined Mixtures
We discuss the advantages of using overdetermined mixtures to improve upon blind source separation algorithms that are designed to extract sound sources from acoustic mixtures. A study of the nature of room impulse responses helps us choose an adaptive filter architecture. We use ideal inverses of acquired room impulse responses to compare the effectiveness of different-sized separating filter configu...
Sound Source Separation: Preprocessing for Hearing Aids and Structured Audio Coding
In this paper we consider the problem of separating different sound sources in multichannel audio signals. Different approaches to the problem of Blind Source Separation (BSS), e.g. the Independent Component Analysis (ICA) originally proposed by Herault and Jutten, and extensions to this including delays, work fine for artificially mixed signals. However the quality of the separated signals is ...
Publication date: 1999